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We present an implementation of the GW approximation for the electronic self-energy within 
the full-potential linearized augmented-plane-wave (FLAPW) method. The algorithm uses an all- 
electron mixed product basis for the representation of response matrices and related quantities. This 
basis is derived from the FLAPW basis and is exact for wave-function products. The correlation 
part of the self-energy is calculated on the imaginary frequency axis with a subsequent analytic 
continuation to the real axis. As an alternative we can perform the frequency convolution of the 
Green function G and the dynamically screened Coulomb interaction W explicitly by a contour 
integration. The singularity of the bare and screened interaction potentials gives rise to a numerically 
important self-energy contribution, which we treat analytically to achieve good convergence with 
respect to the k-point sampling. As numerical realizations of the GW approximation typically suffer 
from the high computational expense required for the evaluation of the nonlocal and frequency- 
dependent self-energy, we demonstrate how the algorithm can be made very efficient by exploiting 
spatial and time-reversal symmetry as well as by applying an optimization of the mixed product 
basis that retains only the numerically important contributions of the electron-electron interaction. 
This optimization step reduces the basis size without compromising the accuracy and accelerates the 
code considerably. Furthermore, we demonstrate that one can employ an extrapolar approximation 
for high-lying states to reduce the number of empty states that must be taken into account explicitly 
in the construction of the polarization function and the self-energy. We show convergence tests, CPU 
timings, and results for prototype semiconductors and insulators as well as ferromagnetic nickel. 
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I. INTRODUCTION 

Many-body perturbation theory with the GW ap- 
proximation for the electronic self-energy offers a well- 
established approach for the computation of excited elec- 
tronic states from first principlesii In principle, the elec- 
tronic self-energy incorporates all many-body exchange 
and correlation effects beyond the Hartree theory. The 
GW approximation contains the electronic exchange ex- 
actly while the screening is treated at the level of 
the random-phase approximation)^ where noninteracting 
electron-hole ring diagrams are summed to all orders. 
This makes the GW approximation particularly suited 
for weakly to moderately correlated systems. 

After its theoretical foundation by Hedin^ in 1965 it 
was not before the middle of the 1980s that the compu- 
tational treatment of real materials became feasible. In 
spite of approximations in the numerical treatment that 
were necessary due to the lack of computer power, the 
first results were very promising. In these works^^"— it 
could be shown that the theoretical band gap fell within 
a margin of 0.1 eV from the experimental value for co- 
valently bonded semiconductors. After these pioneer- 
ing studies, the GW approximation has evolved into the 
method of choice for calculating electronic excitations in 
solid-state systems. 

So far, most codes still rely on the pseudopotential ap- 



proximation, which restricts the range of materials that 
can be examined. Transition-metal compounds and ox- 
ides, in particular, cannot be treated efficiently in this 
approach. Two early all-electron calculations using the 
GW approximation were carried out by Hamada et al^ 
for Si and by Aryasetiawan^i for Ni, both within the lin- 
earized augmented-plaire-wave (LAPW) method. How- 
ever, only very recently were further all-electron imple- 
mentations reported, based on the full-potential LAPW 
(FLAPW) (Refs.[l3and[llL the linearized muffin-tin or- 
bital (LMTO) (Refs. [l3-[l4l) . the projector-augmented- 
wave (PAW) (Refs. [iSl-flTl) . and the Korringa-Kohn- 
Rostoker method (Refll^ together with applications to 
a larger variety of systems. 

While the calculation of GW band structures for small 
systems has become routine, the scientific community is 
increasingly interested in larger and more complex sys- 
tems, such as multicomponent materials, artificial het- 
erostructures, defects, interfaces, surfaces, clusters, and 
nanowires. In codes using periodic boundary conditions, 
such systems must be treated in supercell geometries of- 
ten exceeding 100 atoms. The main obstacle in applying 
the GW approximation in supercell calculations is the 
considerable demand of computation time and memory. 
This is especially true for all-electron methods, where 
the rapid oscillations close to the atomic nuclei make 
the usage of the fast Fourier transformation impossible. 
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Therefore, all-electron GW calculations for large systems 
have so far been prohibitive. In this paper, we describe 
numerical algorithms and approximations that make all- 
electron GW implementations efficient and bring large 
supercell calculations into reach. 

The nonlocality of the self-energy operator is the main 
reason for the large computational effort needed in GW 
calculations. It leads to convolutions in reciprocal space, 
i.e., summations over the Brillouin zone (BZ), which is 
sampled by a finite set of k points. Furthermore, the 
bare electron-electron interaction diverges at the center 
of the BZ, giving rise to a corresponding divergence and 
an anisotropy in the dynamically screened interaction at 
k = 0. A thorough treatment of the F point is hence cru- 
cial for an accurate and efficient BZ summation. Previ- 
ous all-electron implementations^^'^S of many-body per- 
turbation theory have often bypassed this problem by re- 
verting to a plane-wave basis for the Coulomb potential 
and related propagators such as the dielectric function 
but such a projection leads to a loss of accuracy because 
it cannot resolve the rapid oscillations of the orbitals close 
to the atomic nuclei without a prohibitively large basis 
set. As a consequence, physical effects such as core po- 
larization are inadequately described. In an alternative 
approach, the so-called offset-F method, an auxiliary k- 
point mesh that is shifted from the origin by a small but 
finite amount is employed In this way, the singular- 
ity is avoided but the use of additional meshes increases 
the numerical cost; even in the most favorable case, for 
cubic symmetry, the number of k points must at least 
be doubled. Furthermore, the convergence of BZ inte- 
grals involving the Coulomb matrix, for example, for the 
GW self-energy, may be slow with respect to the k-point 
sampling due to the approximate treatment of the quan- 
titatively important region near the zone center. In this 
work, we take the F point explicitly into account and em- 
ploy an analytical treatment of the Coulomb singularity. 
We use a recently developed procedure^ to transform 
the all-electron basis for the interaction potentials, the 
so-called mixed product basis,— in such a way that the 
divergence is restricted to a single matrix element, which 
allows a treatment similar to a pure plane- wave basis set. 

The numerical cost of the BZ convolutions can be 
reduced considerably by employing spatial and time- 
reversal symmetries. Not only can the k dependence of 
response quantities be confined to the irreducible wedge 
of the BZ but the also convolutions over the reciprocal 
space can be restricted to an extended irreducible zone 
without loss of accuracy. Furthermore, in the presence of 
inversion symmetry, the all-electron mixed product basis 
can be transformed in such a way that response matrices 
become real instead of complex, which again reduces the 
numerical cost in terms of computation time and memory 
demand. 

We further demonstrate that we can afford to truncate 
the number of basis functions considerably in the cal- 
culation of the correlation part of the self-energy. This 
is achieved by a basis transformation that diagonalizes 



the Coulomb matrix. Eigenvectors with small eigenval- 
ues then correspond to unimportant scattering contribu- 
tions, which can be neglected in a systematic way. This 
leads to an optimization of the basis set and thus to a 
speed up of the computation. 

The paper is organized as follows. In Sec. |lTl we give a 
brief introduction to the GW approximation. In Sec. lIIIl 
we describe our all-electron implementation in detail: the 
mixed product basis and its optimization, the k-point set, 
the treatment of the F-point divergence, and the usage of 
spatial and time-reversal symmetry. Section IIVI reports 
convergence tests for Si and SrTiOs as well as fundamen- 
tal GW band gaps for a variety of semiconductors and 
insulators. In addition, results for the localized 3d states 
of GaAs and ferromagnetic Ni as an example of a 3c? 
transition metal are discussed. In order to illustrate the 
efficiency of the code, we also show CPU timings for dia- 
mond in supercell geometries containing up to 128 atoms. 
Finally we summarize our conclusions in Sec. |Vl 



II. GW APPROXIMATION 

Angle-resolved photoelectron spectroscopy is the prime 
experimental technique for the measurement of the elec- 
tronic band structure of crystalline materials. The exci- 
tations measured in photoelectron spectroscopy involve 
electron ejection or injection and thus imply a change 
in the particle number by one. The corresponding many- 
body excitation energies -E^q, where q is the wave vector, 
n the band index, and a the electron spin, define the pole 
structure of the one-particle Green function. 



BZ all 



G'^{ry;u;) = J2J2- 



Cq(r)Cq(r') 



(1) 

with the quasiparticle wave functions ipnqi'r) and the 
many-body ground-state energy Eq. Here and in the 
following the number rj is infinitesimal, real, and posi- 
tive, and by a sum over Bloch vectors q, we always mean 
an integration over the BZ multiplied by the density of 
q points V/{8Tr^) with the crystal volume V. Hartree 
atomic units are used throughout except where noted 
otherwise. 

The quasiparticle wave functions '0^q(r) and energies 
obey a set of one-particle differential equations, 

/inV'"q(r) + J K,(r,r';i?„q) 

-<,(r')<5(r-r')]Cq(r')dV' = KA(r) , (2) 

the so-called quasiparticle equations, where Hq is the 
Kohn-Sliam (KS) Hamiltonian 

ho = -^V'+ i^cxt (r) + «H (r) -I- (r) (3) 

with the external, Hartree, and exchange-correlation 
potentials, respectively,^^ and EJ^(r, r';aj) is the non- 
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local, non-Hermitiaii, and energy-dependent exchange- 
correlation self-energy. 

In practical implementations one usually treats the in- 
tegral operator on the left-hand side of Eq. ([5]) as a small 
perturbation. In first order, the quasiparticle energies 
are then given by the nonlinear equation, 

Kq = <q + Kc iK^) - <c I (4) 

with the KS wave functions ip"^^ and energies ej^q. 
This is equivalent to neglecting off-diagonal elements of 
EJc(£'^q) — in the basis of the KS wave functions. For 
the self-energy, we use the GW approximation, 

S^,(r,r';c.) = ^ J G^o{r,r' ;u:+u:')W{r,r' ;uj')e'^-' du:' , 

(5) 

which constitutes the first-order term of the self- 
energy expansion in powers of the dynamically screened 
Coulomb interaction W. Here 

BZ all (7 / \ (7^^ f l\ 



7iq 



«?7SgIl(<q) 



is the time-ordered KS Green function, which is obtained 
from Eq. ([T|) by replacing V'nq(i") with the KS wave func- 
tions (^5^q(r) and E'^^ — Eq with the KS energies e^q mea- 
sured from the Fermi energy. 

The dynamically screened interaction obeys a Dyson- 
type integral equation, 

W{v,v']uj)^v{y,y') (7) 

+ / v{v, r")P(r", r'"; lo)W{v"' , r'; w) d?r" d?r"' 



with the bare Coulomb interaction w(r, r') = 1/ |r — r'| 
and the polarization function P(r,r';w), for which we 
employ the random-phase approximation^^ 



(8) 

GZ{y, r'; a;')GEJ(r', r; a;' - a;)e*'"^' dto' 



P{v,v'-uj) 

BZ occ unocc 



(T q.k 11 
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This approximation corresponds to time-dependent 
Hartree theory and thus neglects exchange-correlation 
(e.g., excitonic) effects in the dynamical screening. With 
the dielectric function, 

e(r, r'; u) = 5[v -^')- j '^(r, r'; w) dV" (9) 

we can write the screened interaction in the closed form, 

I^(r,r';w)= f s-\r,r" ;uj)v{r" ,r') d'^r" . (10) 



Many-body screening effects obviously enter with the sec- 
ond term of Eq. ^ into the formalism. In fact, if we write 
the screened interaction as a sum of the bare interaction 
and a remainder, 

W{r, r'; lo) = w(r, r') + W%r, r'; uj) , (11) 

then the self-energy [Eq. ([5|)] decomposes into the terms 

S^(r,r') = ^ J G^oiry;^ + ^'Hr,r')e^^-'dw' (12) 



and 

E,-(r,r';c.) = ^ J G^(r, r'; + c.')W^^(r, r'; c.') d^' , 

(13) 

which are identified as the exchange and the correlation 
contributions to the electronic self-energy, respectively. 
We note that the exponential factor allows to close the 
integration path over the upper complex half plane in 
Eq. p^ . As W'^{r, r'; uj) fahs off quickly enough with in- 
creasing frequencies, we may take the limit ?7 — >■ before 
integrating in Eq. 

In the next section, we discuss several aspects of the 
implementation that are important for the computational 
efficiency. The numerical procedure is based on an auxil- 
iary all-electron basis set, the mixed product basis, in 
which the previous integral equations become matrix 
equations that can be implemented easily. We have al- 
ready explained this basis set in detail in a previous 
publication^^ and only sketch the basic ideas here. 



III. IMPLEMENTATION 

A. FLAPW method 

In the FLAPW method, ■^^ space is partitioned into 
nonoverlapping atom-centered muffin-tin (MT) spheres 
and the interstitial region (IR). The core-electron wave 
functions, which are (predominantly) confined to the MT 
spheres, are directly obtained from a solution of the 
fully relativistic Dirac equation. The valence-electron 
wave functions with spin a are expanded in interstitial 
plane waves (IPWs) in the IR and numerical functions 
Kimpi^) = '^Zipi'^)Yim{er) iusidc the MT sphere of atom 
a, where r is measured from the MT center located at 
Rq. These numerical functions comprise solutions of 
the KS equation for the spherically averaged effective 
potential for p — and their first energy derivatives 

Khnii^) = 9u''^imoi^)/9eai for p = 1 evaluated at suit- 
ably chosen energy parameters ej^j, and Yim{ej.) denote 
the spherical harmonics. The notation — r/r with 
r = |r| indicates the unit vector in the direction of r. In 
a given unit cell, the KS wave function at a wave vector 
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k with band index n and spin a is then given by 



I 



1 



^^0 m^-l p^O 



1 nkcr 



/ J almp almp 



i(k+G) 



(r - Ra) 

if r e MT(a) 
if r e IR 



|k+G|<G„ 



(14) 

with the crystal volume V, the number of unit cells N, 
and cutoff values l^ax and Gmax- The coefficients A^f^'^^ 
are determined by the requirement that the wave func- 
tion is continuous in value and first radial derivative at 
the MT sphere boundaries. If desired, additional local or- 
bitals for semicore states^ or higher energy derivatives^'' 
can be incorporated by allowing p> 2. We use the fleur 
code^ for the density-functional theory (DFT) calcula- 
tions. 



B. Mixed product basis (MPB) 

If the integral equations of Sec. |lT] are rewritten in an 
auxiliary basis set, they become matrix equations, which 
can easily be treated in a computer code using standard 
linear-algebra libraries. Equation ([8]) already indicates 
that this auxiliary basis set should accurately represent 
wave-function products. This is generally true for all 
quantities that involve two spatial coordinates and thus 
couple two incoming and outgoing electrons with each 
other. 

The FLAPW method uses continuous basis functions 
that are defined everywhere in space but have a different 
mathematical representation in the MT spheres and the 
IR. For the expansion of wave-function products, how- 
ever, it is better to employ two separate sets of functions 
that are defined only in one of the spatial regions and 
zero in the other. In this way, linear dependences that 
occur only in one region can easily be eliminated, which 
overall leads to a smaller and more efficient basis. The 
resulting combined set of functions is called the MPBJ^ 

Inside the MT spheres, the MPB must accurately de- 
scribe the products KLpii^Xi'm'p'iT^)- The angular 
parts Yj^(er)y//„j' (er) can be represented by linear com- 
binations of spherical harmonics yLA/(er) with \l — l'\ < 
L < I + I' and —L < M < L, while the radial parts de- 
fine a set of product functions U^]^p{r) — u'^ip{r)u'^i, p, {r) , 
where the index P counts all possible combinations of 
Z', p, and p' . We emphasize that, in general, the latter lie 
outside the vector space spanned by the original numer- 
ical basis functions u'^ip{r). Initially, the set {U^]^p{r)} 
is neither normalized nor orthogonal and usually has a 
high degree of (near) linear dependence. An effective 
procedure to remove these (near) linear dependences is 
to diagonalize the overlap matrix and to retain only those 
eigenvectors whose eigenvalues exceed a specified thresh- 
old value. In this way, the MT functions become or- 
thonormalized. By using both spin-up and spin-down 



products in the construction of the overlap matrix, we 
make the resulting basis spin- independent. In practice, 
the basis set is reduced further by introducing a cutoff 
value Lmax for the angular quantum number. On the 
other hand, it must be supplemented with a constant 
MT function for each atom in the unit cell, which is 
later needed to represent the eigenfunction that corre- 
sponds to the divergent eigenvalue of the Coulomb ma- 
trix in the limit k 0. From the resulting MT functions 
MaLMpiy) = MaLp{r)YLM{e,,), we formally construct 
Bloch functions. 

In the IR, we use a set of IPWs with a cutoff G^jj^x ^ 
2Gmax in reciprocal space, since the product of two IPWs 
yields another IPW. Together with the MT functions, we 
thus obtain the MPB {Mf{r)} = {M^^]^jp{r), M^{r)} 
for the representation of wave-function products. Unlike 
the MT functions, which have been explicitly orthonor- 
malized, the IPWs are not orthogonal to each other; the 
elements of their overlap matrix can be calculated ana- 
lytically and are given by 

{M^\M^,) = ^kk'OGG'(k) = S^^-Bg-g' , (15) 

where 9g are the Fourier coefficients of the step function, 
which equals 1 in the IR and in the MT spheres. We 
also define a second set, the biorthogonal set. 



G' 



It fulfills the identities 



Y,\MY){MY\=Y.\M't){MY\^l 



(16) 



(17a) 



(17b) 



where the completeness relation is only valid in the sub- 
space spanned by the MPB, though. As the MT func- 
tions and the IPWs are defined in different regions of 
space and the MT functions are orthonormal, only the 
IPWs overlap in a nontrivial way. It should be noted 
that the overlap matrix is k-dependent because the size 
of the MPB varies for different k vectors. 

In general, the matrix representation of real operators 
in an arbitrary complex basis {/^(r)} is Hermitian. If the 
system has inversion symmetry and the basis functions 
fulfill 



/^(-r) = /*(r). 



(18) 



it is easy to show that the matrices become real sym- 
metric. Of course, this reduces the computational cost 
considerably, in terms of both memory consumption and 
computation time. However, according to the current 
definition only the IPWs fulfill Eq. ^ while the MT 
functions do not. For a system with inversion symmetry, 
we hence apply a unitary transformation of the MT func- 
tions such that Eq. is satisfiedj2I In the following, it 
is understood that all quantities are represented in this 
symmetrized basis if inversion symmetry is present. 
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C. Formulation in the MPB 

In this section, we reformulate the equations of Sec. |ll] 
by projecting onto the MPB and exploiting the identities 
in Eq. (fT7|) . Because of the exponential factor in Eq. (IT^ , 
we can formally close the frequency integration contour 
with an infinite half circle over the positive imaginary 
plane without changing the value of the integral. This 
contour integral then equals the sum over the residues of 
the poles of the Green function. The expectation value 
of the exchange term with respect to a wave function 
tPnqir) yields the well-known Hartree-Fock expression 

BZ occ. 

if^qiKiv'Zc,) = - E E E ^^'^(k) (19) 

k n' I, J 

with the projections 

(M,V^q|^;^,q+k) = / A/r(r)^r.q(r)^r.'q+k(r)dV 

(20) 



and the Coulomb matrijs^S 



vij{k) = {Mf\v\M^) = 



(21) 

The sum over the occupied states also comprises the core 
states, which give an important contribution to the ex- 
change self-energy. Its evaluation is simplified consider- 
ably by the fact that the core states can be treated as 
dispersionless bands. We use a formalism derived by Da- 
gens and Perrot.^^ An efficient scheme for the calculation 
of the full nonlocal Fock exchange potential including off- 
diagonal elements will be presented elsewhere<^ 

With the projections [Eq. ([201) ] we readily obtain the 
representation of the polarization function, 



(22) 



BZ occ. unocc. 

EEE E (^^/Vr.ql^n'q+k)(^;^'q+kl¥';^qM 
cr q n n' 

( 1 



1 



V W + <q - <'q+k + - <q + <'q+k " / 

The rational expression in the brackets complicates a 
direct summation over the Brillouin zone. It is more 
convenient to consider the representation (ImP)/,/(k, cj) 
of the imaginary part ImP(r,r';a;) first, which is ob- 
tained by replacing expressions of the form 1/ (a ± ir]) 
by ^7r(5(a). This simplifies the BZ summation signifi- 
cantly. Afterwards a Hilbert transformation yields the 
full polarization matrix P/j(k, w), where the frequency 
argument may be complex. In particular, this allows an 
evaluation on the imaginary-frequency axis, where the 
frequency-dependent quantities show a smooth behavior 



and can therefore be sampled and interpolated with few 
frequency points. As the bracket in Eq. ([2^ is real for 
frequencies on the imaginary axis, the corresponding ma- 
trix P/j(k, iw) with w G K becomes Hermitian; it even 
becomes real symmetric if the system exhibits inversion 
symmetry and we use a symmetrized MPB as described 
in Sec. irilBl 

In the MPB, the integral equations for the dielectric 
function [Eq. and the screened interaction [Eq. ([TU[) ] 
turn into simple matrix equations. The equations become 
particularly simple if we perform a basis transformation 
{MY{v)} {£'^'^(r)} that diagonalizes the Coulomb ma- 
trix. We note that no approximation is involved at this 
stage. The new normalized basis functions are necessar- 
ily orthogonal, and we do not need a biorthogonal set. 
In this new basis the matrix equations become simple 
products. 



epi.(k,w) = (5^,v - y^i;^(k)Ppi.(k,w)v/i',.(k) (23) 
I^^,(k,c.) = J^)e-l{}^,u:)./^) (24) 



with the eigenvalues Wp(k) of the Coulomb matrix 
[Eq. pTI)]. Here we use a symmetrized definition of the 
dielectric matrix e^ii/(k, oj) that is Hermitian (or real sym- 
metric in case of inversion symmetry) for imaginary fre- 
quencies and remains finite at the P point. It is easy to 
verify that the screened interaction remains unchanged 
by this symmetrized formulation. 

In contrast to the exchange self-energy, the frequency 
integral in Eq. ([T3| cannot be replaced by a sum over 
residues because the positions of the poles of T4/'^^(k, uj) = 
(k, uj) — 6 fi^v fj^{\<i) in the complex- frequency plane are 
unknown. Therefore, the correlation self-energy 

(^;^q|I]^(^)|^^q) (25) 
BZ all 



= ^ E E E(^"'q+kl'/'nq^^^)(^.VnqK'q-hk) 

W^M^(k,^') 



k n' fi.i' 
oo 

diu' 



uj + uj' ~ e^,^k + ^1 sgn«'q+k) 



still contains an explicit integration over frequencies. Un- 
fortunately, the integrand has a lot of structure along 
the real frequency axis, which makes a direct evaluation 
difficult. There are two methods that avoid the integra- 
tion over real frequencies and use the imaginary axis in- 
stead: analytic continuation^^ and contour integrationi^ 
The former allows a faster and easier implementation, 
but contains a badly controlled fitting procedure, which 
can be tested with the more accurate contour-integration 
method. We have implemented both algorithms and find 
that they give similar results for the systems considered 
here. In the following, we hence focus exclusively on the 
first approach, which is based on an analytic continuation 
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of Eq. (|25p to the imaginary-frequency axis, 

(26) 

BZ all 
k n' fJ.jL' 

J -co + - <'q+k 

The integration contour can be closed over the positive 
imaginary and negative real half-plane in Eqs. (|25p and 
respectively, and encloses the same poles. Now the 
frequency integration is along the imaginary-frequency 
axis, where the integrand is much smoother. In practice, 
we use a discrete and finite mesh for the imaginary fre- 
quencies, which is dense near w = 0. A tail is fitted to the 
last mesh point according to the known asymptotic 
behavior of the screened interaction. Between the mesh 
points, we interpolate W^j^(k, a;(A)) with cubic splines, 
where aj(A) = A/(l — A) maps the interval [0, 1[ to [0,cx)[. 
This allows a stepwise analytic integration. With this 
procedure only a small number of mesh points is needed, 
typically around 10. 

Once the correlation self-energy is calculated on the 
discrete imaginary-frequency mesh, we analytically con- 
tinue it to the whole complex plane by fitting the model 
function, 

/(cj) = °P for Rew>0 (27) 

p—i ^ 

with complex fit parameters ap and ujp. Due to the lo- 
cation of the poles of the correlation self-energy in the 
complex plane - above the real axis for Recj < and 
below the real axis for Rew > - one must analytically 
continue from the positive imaginary axis to the positive 
real axis. For symmetry reasons one then obtains 

a* 

/(^)=E^7^ f""" Rec^<0 (28) 
p—i P 

on the negative real axis. In principle, Np is a con- 
vergence parameter. However, as the number of imag- 
inary frequencies where {ip'^^l'S'^ {iLu)\tp'^^) is known is 
relatively small, and as the fitting procedure quickly be- 
comes prohibitive for large numbers of fit parameters, 
one usually uses only few poles, e.g., Np = 3. After find- 
ing the parameters Up and Wp, the correlation self-energy 
((/?^q|Ejr(a;)|i^5^q) is approximated by the analytic func- 
tion f{<jj), which allows to solve the nonlinear quasipar- 
ticle equation (jj]) to machine precision with the standard 
iterative Newton method and without any additional lin- 
earization of the self-energy. 

D. Brillouin-zone sampling 

Both the polarization function and the self-energy are 
defined as products in real space. These become the 



convolutions [Eqs. (fTO|) . (P^ . and (US])] in a reciprocal- 
space formulation, which is better suited for infinite peri- 
odic systems because all nonlocal quantities then become 
block diagonal. We employ the tetrahedron method for 
summations over the BZ.^^ 

These equations do not only contain the two Bloch vec- 
tors k and q, but also their sum k -|- q, at which the KS 
wave functions and energies must be known. Therefore, 
we choose the set of k points k„j„2„3 = Yl^i=i ^i^i/^i 
with rii = 0, A'i — 1 and the reciprocal basis vectors 
bj. We denote the number of k points by iV^ = N1N2NJ,. 
It naturally includes the point k = 0, which is special be- 
cause the long-range nature of the Coulomb interaction 
makes the Coulomb matrix vij (k) and also the screened 
interaction W^ti,(k, oj) diverge in the limit k — 0. This 
divergence must be taken into account in order to obtain 
fast convergence with respect to the k-point sampling. 
We will discuss a numerically stable and efficient treat- 
ment in the next section. 



E. F-point treatment 

The exchange and correlation self-energy contributions 
in Eqs. ([T^ and (|26p each contain a sum over the BZ. 
The interaction potentials w/,/(k) and W^ii,(k, w) diverge 
in the limit k — > but as this pole is only of second order 
(1/fc^) , a proper three-dimensional integration over k = 
will yield a finite value. 

Likewise, the calculation of the dielectric matrix 
[Eq. ((23| ] involves a product of the polarization function 
and the divergent Coulomb matrix. However, a closer in- 
spection of the polarization matrix [Eq. (1^^ ] in the basis 
{i?Jj(r)} shows that the head element Pii(k, w) and the 
wing elements P^i(k, cj) and Pip(k, w) with /i > 1 are 
of the order and fc, respectively, so that the dielectric 
matrix e(k, w) remains finite but angular-dependent at 
k = OP- 

In any case, the divergence gives an important con- 
tribution to the self-energies and response functions and 
must be treated with care. There are several numeri- 
cal approaches. Kotani and van Schilfgaarde^?^ replaced 
the r point by three additional points kg nearby, the so- 
called offset r points. These might be reduced by sym- 
metry, e.g., to a single point in the case of cubic symme- 
try. However, because of the summations in Eqs. ([T^ . 
(122), and (|26p . each additional point kg requires a com- 
plete auxiliary mesh {kp + q, q € BZ} on which the KS 
Hamiltonian must be diagonalized and the resulting wave 
functions and energies must be stored. This at least dou- 
bles the k-point set, thus increasing the numerical cost 
in terms of computation time and memory demand. In 
another approach, Ku and Eguiluz^ - as well as Puschnig 
and Ambrosch-Draxli^ used a plane-wave basis for the 
Coulomb potential and related propagators and thereby 
departed from a complete all-electron description because 
plane waves are too inflexible to resolve the rapid vari- 
ations in the wave functions close to the atomic cores 
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without a prohibitively large basis set. 

Here we present a scheme that does not require ad- 
ditional k points or projections onto plane waves. It 
thus combines the accuracy of an all-electron approach 
with the numerical efficiency of a minimal k-point set. 
In the following, we present the algorithm for the two 
self-energy contributions and the dielectric matrix. 



1. Exchange self-energy 

If the k-point summation in Eq. (I19p is replaced by an 
integral, we can smoothly integrate over the divergence of 
viji\f.) and obtain a finite value. To this end, we formally 
consider the Fourier transform 



Y I- g--i(k+G)-rgi(k+G')-r' 
V 

47r 



T^-'^GO'^G'O + ^'GG'(]^) 



(29) 



with the nondivergent and infinitely large matrix 
WQQ/(k) = (1 — 5Go)<5GG'47r/|k -|- Gp. In a previous 
publication,'^'' we found an analogous exact decomposi- 
tion of the Coulomb matrix [Eq. ((2T|) ] into the same di- 
vergent term and a nondivergent remainder Ujj(O). Thus 
no projection onto plane waves is necessary, and we re- 
tain the full accuracy of our all-electron formulation. Re- 
placing Wqq, (0) by W7j(0) and inserting Eq. ([29)) into 
Eq. p9|) leads to contributions from the divergent term 



(<^nq|Sxl<^nq)dr 



V 



(30) 

with the occupation numbers f"^ and from the nondiver- 
gent term 



('^«q|Sxl'/'^q)ndiv 
BZ occ. 



(31) 



k n' /,J 



where we have set W/j(0) — > vij{Q) for simplicity. The 
divergence of the Coulomb matrix is restricted to the 
first term of Eq. ([^^ , and the corresponding eigenfunc- 
tion is e^^''^ /^/V, whose k — )• limit, a constant func- 
tion, can be represented exactly by the MPB. The prod- 
ucts of 1/fc^ with higher-order terms of the projections 
(e''' ""iy9^q|(/3j^,q_i_jj.)/\/y cau bc of zeroth order and then 
lead to additional contributions to Eq. ([?T|) for k = 0. 
Therefore, the projections must be expanded with the 
help of k • p perturbation theory. We have found that 
these corrections improve the k-point convergence sig- 
nificantly in some cases, while in others - especially for 
small band-gap semiconductors like GaAs - they worsen 



the convergence. The behavior also depends on the par- 
ticular electronic state If/S^q). For simplicity, we defer an 
in-depth discussion to a later publication. 

In order to be able to integrate analytically, we extend 
the BZ integral in Eq. ([5D[) to the whole reciprocal space 
by replacing with 



3-/?|k+G|- 

Ik+GP 



(32) 



Note that F(k) diverges at every reciprocal lattice vector 
G. The exponential function was included to ensure the 
convergence of the sum everywhere else. This function is 
formally identical to the one used by Massidda et al. in 
Ref. |33j. However, these authors define /? as a parame- 
ter depending on the BZ size. Instead, we choose (3 to 
be as small as possible (while still allowing a sufficiently 
fast converging sum over G), independently of the BZ. 
First, this ensures that supercell calculations of the same 
system yield identical values. Second, zero-order terms 
arising from products of the exponential function with 
1/fc^ are small and can be neglected. After replacing 
1/fc^ with F{\i) and extending the integral and summa- 
tion over the whole reciprocal space we obtain 



(<^nq|S2l</'nq)dr 



(33) 



4'r/:q Uz^ — 



87r3 



E 



0<|q| 



where the sum runs over all vectors q = k + G 7^ and 
is the unit-cell volume. As both terms diverge for /3 — > 0, 
we introduce a cutoff radius go and finally obtain 



(V^q|Sxl¥',%)dr 



= -1" 
J nc 



(34) 



E 



47r 



0<|q|<9o 



We get rid of go as a convergence parameter by choosing 
g-^go = ^ as a cutoff criterion for the summation. In 
practice, we find that /? = 0.005 is typically small enough. 



2. Response functions 

The treatment of the polarization and dielectric func- 
tion in the limit k — > is simplified considerably by the 
basis transformation {Mf{r)} — ^ {£'^(r)} introduced 
in Sec. IIII CI because it confines the divergence of the 
Coulomb matrix to a single eigenvalue t;i(k) ^ An/k'^ 
(Ref. [20I ). The corresponding eigenfunction E^{r) ~ 
gik-r^y/y is known analytically. 

Let us consider the projection [Eq. ([^n[) |. where the 
MPB function is replaced by this eigenfunction. Be- 
cause of the orthogonality of the wave functions, we have 
{e^^''^^Zq\'PZ'q+k) ^nn' in the limit k 0. For the mo- 
ment, we restrict ourselves to the case of semiconductors 
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and insulators where the band indices n for occupied and 
n' for unoccupied states always differ such that in lead- 
ing order (e*'' '"(/3^q|</5^,q^j^) ~ k. The linear order in k 
exactly cancels the singularity of the Coulomb matrix 
in the dielectric function [Eq. (|23p ] and can be calculated 
with k-p perturbation theory,— which allows a full treat- 
ment of the divergence and the dielectric anisotropy at 
k = 0.'^'* In this way, the matrices decompose into head, 
wings, and body as in a simple plane- wave basis set. Still, 
the all-electron accuracy is fully retained and no projec- 
tion onto plane waves is necessary. 

For the sake of completeness, here we give the exact 
expressions for the polarization function and the screened 
interaction, taking into account the full anisotropy. From 
k • p perturbation theory, one obtains the form 



/ ejH(w)ekA;2 ejs2(a;)fc 

-P22(W) 



elsl{uj)k 



Bjs„(w)fc \ 



(35) 



for the polarization function in the limit k — 0. Here 
H(aj) is a 3 X 3 matrix, s^(cj) are three-dimensional vec- 
tors, and the matrix elements P^i^{uj) are finite. We note 
that for frequency arguments that are not purely imag- 
inary the matrix P^i^(k, w) is not Hermitian; in particu- 
lar, the horizontal and vertical wings are then not simply 
the complex conjugates of one another. Otherwise, the 
formalism is very similar to the one given here. The cor- 
responding screened interaction becomes 



W^^(k, iuj) 



/ ••• \ 

( Air/k'^ 



ei^L(w)ek 



(36) 



ejw2 {ijj)/k 
e^w^(w)/fc |ejw2(a;)| 



I 



ej^w„(a;)/fc \ 
[eTw^(w)] [ejw„(w)] 



|ejw„(w)| 



with the finite matrix elements 



(37) 



v,,{0)P^,{io)^v,{()), (38) 



where fj.jV > 2. The divergent and, in general, angular- 
dependent second term derives from the head and wing 
elements of Eq. with 



there is a contribution from intraband transitions across 
the Fermi surface. These transitions occur within one 
electron band, i.e., n = n', in which case the projection 
above becomes unity in the limit k — ^ 0. However, it 
can be shown'^^ that the expression in the brackets of 
Eq. (|22p will then be of linear order in k and that we 
obtain a contribution only for the head element of the 
polarization matrix, the so-called Drude term 



w^(c^) = ^/4^^ e7J(L.)s,(^)v/^ (39) 



v>2 



and 



L(w) - 1 - 47rH(w) - Vi^Y,s^{oj)wl{uj)^v,,{0) . 

fj,>2 

(40) 

Let us now turn to the case of a metallic system, where 
in addition to the interband transitions with n ^ n' , 



pD(k,z^) 



47r Ijj(lu + irj) ' 



(41) 



where iOp\ is the plasma frequency obtained by an inte- 
gration over the Fermi surface. The Drude term gives 
rise to a contribution a;pj/[a;(aj -I- ir])] for the head ele- 
ment of the dielectric matrix [Eq. (|23p ]. which will mix 
with all other elements in the inversion for the screened 
interaction [Eq. ([M)) [. However, we find that VF{^]^(k,(u;) 
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is dominated by the bare Drude term 



in 



fc2 



(42) 



in the hmit k 0. As this expression can be convo- 
luted with the Green function in Eq. (pS)) analytically, 
we subtract it from the head element and treat the re- 



mainder Wii(k,iuj) — (k, iuj)\i^^o numerically. The 
treatment of the l/fc^ divergence is explained in the next 
section. 



3. Correlation self-energy 

The BZ summation over the l/fc^ divergence in the 
correlation self-energy [Eq. (pS)) ] can be treated with the 
same procedure as outlined in Sec. IIII E II for the ex- 
change self-energy. However, in this case, there are ad- 
ditional 1/fc terms, and all divergent terms exhibit an 
additional angular dependence. 

As a first step we describe this angular dependence 
with the help of spherical harmonics. For example, for 
the head element we must find the coefficients K^2i)m{'-^) 
in 



k • p perturbation theory, which is used to compute the 
higher-order terms. 

The head element exhibits a l/Zc^ divergence, which 
can be treated in the same way as in Sec. IIII E II We 
obtain a contribution 



{Vnci\K{i^)\vlq)div 



(45) 



V 



BZ 



where the frequency integration is performed as described 
in Sec. IIII CI All other elements K'^^^-^^_^{ijj) as well as 
the divergent wing elements of Eq. ([551) need not be 
taken into account, as their angular parts integrate to 
zero. However, we note that there is a finite contribution 
of these elements from multiplications with higher-order 
terms of the other quantities - a contribution that we 
neglect here for simplicity, as previously mentioned. 



F. Optimization of the MPB 



47r 



ei^L(a;)ek 



oo 21 

E E 

i=0 m=-2i 



^(2;)m(t^)>^(2/)m(ek) ■ (43) 



If we multiply with the denominator ejL(tj)ek = 
ELoE!n=-/-^(2;)m('^)^(2;)m(^k), usc the Gauut coeffi- 
cients and the orthogonality of spherical harmonics, we 
obtain an infinite system of linear equations, from which 
the coefficients K(^2i)m{'^) can be deduced. The corre- 
sponding expansions for the wing. 



ejL(a;)ek 



2/+1 

E 

/=0 m=-(2i + l) 



E 



('^)^(2;+i)m(ek) 
. 

and body matrix elements can be calculated in a similar 
way. Finally, subtraction of 5^i,vV^{0) with ui(0) ~ Air/k'^ 
yields the expansions for the head and body of W^. 
Then K^^dJu;) = if(20m(^) - (4^)'/''^/o and W^,{u;) = 
W^^{uj) - (5^,.Wp(0) replace K(^2i)m{^) and Wp^(a;), re- 
spectively. 

The body matrix elements of the second term in 
Eq. (j36p are angular-dependent but finite. Then all terms 
with / > integrate to zero, and we retain only the con- 
stant term / — 0, which we simply add to W^i^. The head 
(wing) elements diverge with a factor As in 

Sec. IIII E 1| multiplication with higher orders of the pro- 
jections and the term l/{iuj + iuj' — e'^,^_^j^) leads to terms 
of zeroth order in k. Again we do not discuss these terms 
explicitly, as they improve the results only in some cases, 
while in others, they can lead to numerical problems. 
This can be attributed to the energy denominators of 



If we assume that the eigenvalues (k) are ordered ac- 
cording to decreasing size, then matrix elements e^^ (k, uj) 
and Wfii,(k, u) with large indices will be relatively small, 
cf. Eqs. and ([21]) . We may then introduce a thresh- 
old value Vjnin for the eigenvalues and only retain the 
functions E^{r) with w^(k) > u„iin- As the eigenvalue 
f^(k) can be viewed as a measure for the importance 
of the corresponding function E^{r) in t;(r), we restrict 
ourselves to the dominant part of the electron-electron 
interaction in this way. The removal of basis functions 
with small eigenvalues can be viewed as an optimization 
step of the MPB, because it reduces the matrix sizes and 
hence the computational cost considerably without com- 
promising the accuracy, as we show in Sec. IIVI below. We 
there also demonstrate that the results converge reason- 
ably fast with respect to the threshold parameter Vmin- 
Note that with Vmin — 0, the full accuracy of the MPB 
is restored. In our implementation, this optimization of 
the MPB only affects the correlation self-energy while we 
always calculate the exchange self-energy with the full 
MPB. 



G. Use of symmetry 

The evaluation of Eqs. (HH), and takes con- 

siderable computation time, which can be reduced sub- 
stantially by exploiting spatial and time-reversal symme- 
tries of the system, the latter in case of a system without 
inversion symmetry. Allowed symmetry operations are 
those that leave the Hamiltonian invariant. With these 
operations, the set of k vectors decomposes into groups of 
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equivalent vectors, which are equivalent in the sense that 
all elements of the group can be generated by applying 
the symmetry operations to an arbitrary representative 
of the group elements. As a consequence, any physical 
quantity defined for the representative k vector can be 
mapped to any other vector of the group by a suitable 
symmetry operation. This reduces the full BZ sampling 
to the smaller set of representative vectors, which form 
the irreducible BZ (IBZ). 

We may thus restrict P/j(k, w) to k e IBZ. The sum- 
mation over q points in Eq. (j22p . on the other hand, can- 
not be confined in the same way, because the terms to be 
summed also depend on k (and q -I- k). However, we can 
restrict the q vectors to an extended IBZ [EIBZ(k)] that 
is defined in the same way as the IBZ above but with 
the subset of symmetry operations that leave the given 
k vector invariant. 

Let us define the complete set of Na symmetry opera- 
tions by 



Sa = {A = {Ai,a,,ai) 



1, 



,Na}, (46) 



where A; and a; denote the 3x3 rotation (or rotoinver- 
sion) matrix and a translation vector (which is nonzero 
for nonsymmorphic operations), respectively, and ai 
equals —1 (-1-1) for operations with (without) time rever- 
sal. The action of Ai on a spatial vector r, a momentum 
vector k e BZ, and a function /(r) is declared by 



Aik = aiAik + G , 



V[A-i(r-a.)] fora, = 1 
r[A-i(r-a,)] fora, = - 



(47) 
(48) 

(49) 



where the reciprocal lattice vector G folds aiAik back 
into the BZ. Furthermore, we define the subset 

S^^{A^\i^l,...,N'^;A^k = k}eSA (50) 



that generates the EIBZ(k). 



Now we reformulate Eq. (I22p using the definition of the 
EIBZ(k) and that AiLp'^^{r) is a valid wave function with 
the momentum vector A^q, 



EIBZ(k) 



(51) 



occ unocc 



EIBZ(k) 



occ unocc 

EE 



{A^Mfcp^r.ci\^:,^+k) 



where is the number of equivalent q vectors with re- 
spect to Sa , Ti the identity, and T_i the transpose op- 
erator r_i_B/j — Bjj. From the definition of the MPB, 
it is clear that the application of an arbitrary symme- 
try operation A^ G S\ to any M]'(r) can be written as 
a linear combination of the basis functions Mj(r), such 
that the sum over the symmetry operations in Eq. (|5ip 
can be performed at the very end after summing over the 
bands, the EIBZ(k), and the spins. We note that this is 
also possible with the set {i?'^(r)} instead of {M]'(r)}. 



In a similar way, we can accelerate the evaluation of 
the expectation values of and Yj'^{uj). To this end, we 
write Eqs. (|19p and (I26p in a common general form with a 
function /(r, r'), which fulfills all symmetry properties of 
the system. By confining the summation over k points to 
the EIBZ(q) and summing over the symmetry operations 
we obtain 



BZ 



KqiSKq) = EE d^rd^r'^:lir)ipZ,^ir)^^J'^ir')^:^{r')f{r,r') (52) 



N'i EIBZ(q) „ 

= E E T^E // ^'^'^''-'^qWt^^ ^Wir)][A^ <,k(0]<q(0/(r,r') 

^=l k „, J J 

EIBZ(q) q 

= E E T^E // ^'^'^''''[^^»q(rM'kW^«'k(0[i^^^^^^ 
i=l k A „, J J 



EIBZ(q) 



= E E E E ^,n^ncA?,n'nc | E / / ^''^ '^"'k W '^n'k (r') ^P^^'q i^') /(r, r') , 

k ^ m m' \ j=l 
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where we have restricted ourselves for simphcity to sym- 
metry operations without time reversal. ^fmnq ~ 
((p^q|A^|(/5^q) is the matrix representation of Af in terms 

of the wave functions. As A'^ commutes with the Hamil- 
tonian, the element can only be nonzero if the 

corresponding energies and e^q are identical. Let us 
assume that (p'^^^ and ip'^^ lie in the eigenspace formed by 
(pj^q with ni < v < n2. By construction Af „„q is then an 
irreducible representation, and we may apply the great 
orthogonality theorem of group theory, "^^ 



(a) 



i^mnci i.7n' n<i 



E 

i=l 

which finally yields 

EIBZ(q) 



712 —ni + 1 



(53) 



(54) 



n2 — ni-\-l 

x<q (r')/(r,r') 



m—ni n' 



The k-point sum is thus reduced to the EIBZ(q), but we 
have to average over the degenerate states ni < m < n2- 
However, the gain in computation time by a restriction 
to the EIBZ(q) usually outweighs the overhead from the 
summation over the degenerate states by far. For sym- 
metry groups with time-reversal symmetries the deriva- 
tion can be done analogously with a more general great 
orthogonality theorem. '^^ The final result is identical to 
Eq. IMl). 



T 
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T 





Figure 1: Convergence of (a) the r25'v ^ Fisc and r25'v — >■ 
Xic gaps of Si and (b) the Fisv F25'c and Rig/v T25'c 
gaps of SrTiOs as a function of the MPB cutoff parameter 
imax for the angular quantum number. 



IV. TEST CALCULATIONS 

We have implemented above algorithm in the computer 
program SPEX. In the following, we first show detailed 
convergence tests for Si and SrTiOs. Silicon is a pro- 
totype semiconductor, for which many GW calculations 
already exist and which is therefore used as a benchmark 
material. Strontium titanate is a prototype transition- 
metal oxide, which crystallizes in the frequently occur- 
ring perovskite structure. It is currently explored as a 
high-K dielectric and a promising barrier material in spin- 
tronics and nanoelectronics. The valence and the lowest 
conduction bands are formed by O 2p and Ti 3d states, 
respectively. We explicitly include the semicore 3s and 
3p states of Ti as well as the 4s and Ap states of Sr 
with the help of local orbitals and take their contribu- 
tion to the screening into account. All these states are 
accurately described by the FLAPW basis set. Addi- 
tional local orbitals are used to improve the description 
of high-lying unoccupied states. As reference, we also 
show an overview of GW results for a wide range of semi- 
conductors including BaTiOa and compare them with 
experimental and theoretical values from the literature. 



Furthermore, the efficiency of our scheme is illustrated 
by calculations for diamond supercells containing 16 and 
128 atoms. 

The numerical procedure involves a number of conver- 
gence parameters, which determine the accuracy of con- 
volutions in real space (MPB), reciprocal space (k-point 
set), and the frequency domain. Since the latter two 
apply to essentially all electronic-structure methods, we 
concentrate mainly on the parameters for the MPB here. 
Specifically, we consider the cutoff parameters L^ax for 
the angular momentum inside the MT spheres and G'^^^ 
for the IPWs as well as the threshold v^nin for the opti- 
mization of the MPB according to Sec. IIIIFl We also 
discuss the convergence with respect to the number of 
unoccupied states for the summations in Eqs. ([^^ and 
All calculations are done with a 4x4x4 k-point set 
and the local-density approximation (LDA) (Ref. 37)for 
the exchange-correlation functional at the DFT level. 

The MPB is designed as a basis for the products of the 
wave functions [Eq. ([14])] , for which MT functions with 
angular momenta as large as ^max = 8 or even larger are 
typically taken into account. As a consequence, an exact 
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Figure 2: Convergence of (a) the — > Tisc and r25'v — > 
Xic gaps of Si and (b) the Fisv r25'c and Ris'v ^ 
gaps of SrTiOa as a function of the MPB cutoff parameter 



Gmax for the IPWs. 



Figure 3: Convergence of (a) the r25'v Fisc and r25'v 
Xic gaps of Si and (b) the Fisv — > r25'c and Rv — >■ r25'c gaps 
of SrTiOs as a function of ^J'iii /vmi-n- 



representation of the products inside the spheres requires 
spherical harmonics at least up to 2/iiiax- However, the 
high angular momenta in the original FLAPW basis are 
mostly needed to ensure an accurate matching to the 
IPWs and contribute little to the actual wave functions. 
In fact, we find that the cutoff parameter Lmax for the 
MPB can be choseir much smaller than 2Zmax and, indeed, 
even smaller than ^max- Figure [1] shows the convergence 
of the quasiparticle transitions r25'v ^ibc and r25'v 
Xic in (a) Si as well as Fisv — F25'c and Ri5'v ^2Wc in 
(b) SrTiOs with respect to Lmax- Convergence to within 

0. 01 eV is already attained with Lmax = 5 for Si and 

1, „ax = 4 for SrTiOs. With these cutoff values, the MPB 
contains 292 and 853 MT functions, respectively. 

A similar statement can be made about the conver- 
gence parameter GJ^ax ^'^^ IPWs. Although an ex- 
act representatioir requires twice the wave-function cut- 
off Gmax, which we choose as 3.6bohr~^ for Si and 
4.3bohr~^ for SrTiOs (giving rise to around 200 and 
550 augmented plane waves, respectively), again a much 
smaller cutoff parameter is sufficient for the products: 
Figure [2] confirms that convergence to within 0.01 eV is 



achieved with G'„^ = 2.7bohr and G' = 3.2bohr ^ 
for Si and SrTiOs, respectively, which yield around 100 
and 250 IPWs in the MPB. In general, we find that the 
ratio G^ax/'-'max ~ 0.75 can be used as a rule of thumb 
and works well for all materials considered here. 

In Fig. [21 we show the coirvergence of the gap ener- 
gies of Si and SrTiOa with respect to the threshold value 
Vrain defined in Sec. IIII Fl If the MPB was complete, the 
eigenvalues of the Coulomb matrix would be given by 
the Fourier transform t;G(k) — 47r/|k-|-Gp. With this 
in mind, the threshold value can be reformulated in terms 
of a cutoff in reciprocal space |k -I- G| < •\/47r/t;min, very 
similar to that for the IPWs discussed above. Therefore, 
it is reasonable to show the convergence in terms of this 
cutoff value, even though the MPB is only complete in 
the subspace spaimed by the wave- function products, of 
course, and the Fourier transform ?;G(k) is hence only an 
estimate for the eigenvalues. 

Figure [31^ a) shows the convergence of the quasiparticle 
transitions for Si with respect to v^tt/z^J^. We observe 
that the values are converged to within 0.01 eV around 
3.5 bohr~^, which corresponds to Wi„i,i = 1.9 ha. With 
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Figure 4: Rank of the matrix W for the screened interaction, 
which equals the number of basis functions after the optimiza- 
tion, for Si and SrTiOa as a function of •y/47r/i;min- 

these values the rank of the matrix W (see Fig. [4]) is 
reduced from 392 (the full MPB) to around 75 and the 
computation time from 140 to 42s on an Intel Xeon (2.66 
GHz, 4 MB cache) work station. Interestingly, the curve 
of the indirect transition in Fig. [S^a) exhibits a sudden 
step between 5.25 and 5.5 bohr^^, where the gap energy 
changes by 4 meV. A similar but much smaller step of 
0.7 meV can also be observed in the direct transition. 
Noting the simplified estimates i^G(k) for the eigenval- 
ues, this can be attributed to a shell of reciprocal vectors 
with length |k-|-G| that enter between the radii 5.25 
and 5.5 bohr~^ and give a sizable contribution. Although 
simplified, this is the correct picture, because the true set 
of eigenvalues Vf^ (k) usually contains many groups of de- 
generate eigenvalues, especially at high-symmetry points 
k in the BZ. By analogy, these groups can be viewed as 
shells of k -t- G vectors in reciprocal space. 

In SrTiOa, the gap energies converge somewhat less 
smoothly, but systematically. From Fig. ^h), we see 
that after 5.3 bohr"^ the energies change by less than 
0.01 eV. We note that the convergence criterion of 0.01 eV 
is quite ambitious for GW calculations. If we relax this 
criterion to, e.g., 0.05 eV, which should be sufficient for 
most studies, considerably smaller cutoffs suffice. 

Equations and (PE)) involve a summation over the 
unoccupied states fn'q+ici^)- practice, we must trun- 
cate this sum at some maximal band index N' . It is well 
known that a proper convergence of the GW quasiparti- 
cle energies requires very many unoccupied statesi^i^^ 
While for SrTiOs, a relatively modest number of 200 
states is sufficient, the band gaps of Si, in particular, the 
indirect one, are more difficult to converge. Recently, 
Bruneval and Gonze^^^ proposed an approximate scheme 
that corrects for the neglect of the states with indices 
n' > N' and only involves the states with n' < N'. They 
showed that this extrapolar correction reduces the num- 
ber of states needed for convergence considerably with 
only a small computational overhead. In short, all states 
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Figure 5: Convergence of the — ^ Xic gap of Si with 
respect to the number of bands used in the construction of 
the polarization function as well as the self-energy with (solid 
line) and without the extrapolar correction (dashed line). 



n' > N' are placed on a fixed energy above all others, 
which allows to take the energy denominator of Eqs. ((22|) 
and (pS)) out of the sum over n' and to use the com- 
pleteness relation for the one-particle states V'n'q+kC'')- 
The final expression can then be evaluated only with the 
states n' < N' . We have implemented this scheme in an 
all-electron code. Contrary to Ref. ;3^ we do not employ 
a plasmoii-pole model, though, but use the full matrix of 
the screened interaction in the correction. As shown in 
Fig. [5l we find a considerably improved convergence with 
respect to the number of bands, too. The fixed energy for 
the bands n' > N' is placed 16 Ry (217.7 eV) above the 
maximal energy of the bands n' < N'. However, all other 
results in this paper were obtained with the conventional 
summation. We note that the LAPW basis is a relatively 
small and accurate basis for the occupied states. In or- 
der to get enough unoccupied states for GW calculations 
it is therefore often necessary to extend the LAPW ba- 
sis by increasing the reciprocal cutoff radius Gmax and 
introducing additional local orbitals. 

For reference, we list the fundamental LDA and GW 
band gaps for a variety of semiconductors and insula- 
tors in Table HI together with experimental and other 
theoretical values for comparison. The latter are calcu- 
lated with the LMTO (Ref. T^) and the PAW method 
(Ref. SO). Our own results for the fundamental band 
gaps are converged to within 0.01 eV with respect to the 
numerical parameters, including the BZ sampling. We 
find that an accurate description of high-lying unoccu- 
pied states with additional local orbitals is crucial for 
properly converged GW results. The core states can also 
have a sizable effect on electron correlation and the re- 
sulting band gaps. For example, inclusion of the cation 
2s and 2p states of MgO and NaCl changes their band 
gaps by as much as 0.2 eV. Semicore states (e.g., Mg 2p) 
are described with local orbitals, while deeper core states 
(e.g., Mg 2s) are treated as dispersionless bands, whose 



14 



Table I: Fundamental GW band gaps for a variety of semi- 
conductors and insulators compared with experimental and 
theoretical values from the literature. We also indicate the 
LDA eigenvalue gaps. All values are in electron volts. 





LDA 


GW 


LDA'' 


GW 


LDA'' 


GW^ 


Expt. 


Ge 


0.02 


0.75 


-0.08 


0.57 


— 


— 


0.74'= 


Si 


0.62 


1.11 


0.46 


0.90 


0.62 


1.12 


1.17'' 


GaAs 


0.29 


1.31 


0.33 


1.31 


0.49 


1.30 


1.63'' 


CdS 


1.17 


2.18 






1.14 


2.06 


2.58'= 


GaN 


1.67 


2.83 


1.81 


3.03 


1.62 


2.80 


3.27' 


SrTiOs 


1.80 


3.36 










3.25s 


BaTiOa 


2.18 


3.18 










3.3'' 


CaSe 


2.04 


3.63 










3.85' 


C 


4.15 


5.62 


4.11 


5.49 


4.12 


5.50 


5.48'' 


BN 


4.35 


6.20 






4.45 


6.10 


5.97j 


MgO 


4.64 


7.17 


4.85 


6.77 


4.76 


7.25 


7.83'" 


NaCl 


4.90 


7.53 










8.5' 



''Reference 
''Reference 
'= Reference 
''Reference 
*= Reference 
'Reference 
s Reference 
''Reference 
'Reference [4^ 
^ Reference |4i 
''Reference 149 
' Reference ISC 



wave functions are confined to the MT spheres. Over- 
all our LDA and GW values agree very well with those 
of Ref. 113, but somewhat less so with the older Ref. [H. 
As expected, the LDA considerably underestimates the 
band gaps. The GW self-energy corrects this underesti- 
mation in such a way that the results come very close to 
the measured values. However, there is still a slight un- 
derestimation in most cases. It has been suggested that 
a self-consistent scheme could improve the GW values 
further.— 1^ The starting point is then optimized in such 
a way that the resulting one-particle orbitals are as close 
as possible to the quasiparticle wave functions; in partic- 
ular, closer than those from standard local or semilocal 
functionals. In this way, the first-order perturbative cor- 
rection [Eq. ([4])], where the quasiparticle wave functions 
are approximated by the one-particle orbitals, is better 
justified. However, self-consistent GW calculations are 
computationally very expensive. When compared with 
the electronic self-energy, the most obvious source of er- 
rors in the local and semilocal DFT functionals is the 
missing self-interaction correction, which influences the 
shape of the KS wave functions. Therefore, better results 
might alternatively be obtained if one uses a functional 
that treats electronic exchange more accurately, e.g., the 
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Figure 6: Band structure of Ni calculated by the LSDA 
(dashed lines) and the GW approximation (solid lines) for 
majority (left panel) and minority spin (right panel). The 
Fermi energy is set to zero. 



exact exchange functional within the optimized-effective- 
potential method or hybrid functionalsj^i^ These ap- 
proaches go beyond the scope of the present paper. Nev- 
ertheless, we note that the numerical procedure for the 
GW approximation presented here is independent of the 
starting point and could also be applied within a self- 
consistent scheme or to functionals containing an exact 
exchange term. 

As the GW approximation contains the exact exchange 
self-energy, it does not suffer from the unphysical self- 
interaction error present in local density functionals such 
as the LDA. Localized states are most strongly affected 
by this error and appear too high in energy within the 
LDA. Thus, the absence of the self-interaction error in 
the GW approximation should lead to an improved de- 
scription of these states. In fact, the quasiparticle levels 
of the localized Ga and As Sd semicore states in gallium 
arsenide lie 2.1 and 3.1 eV deeper than their LDA coun- 
terparts. Their theoretical binding energies are 16.9 and 
38.4 eV, respectively, which still underestimates the ex- 
perimental values of 18.82 and 40.76 eV from x-ray pho- 
toemission spectroscopy.^^ It has been shown that self- 
consistent calculations can further improve the d-band 
positions. "^^'^^ 

In Fig. [51 we show the local-spin-density approximation 
(LSDA) (Ref. IstI ) and GW band structures for ferromag- 
netic Ni. The self-energy correction was calculated with a 
8x8x8 sampling of the BZ. Convergence was tested with 
a 10 X 10 X 10 set. While for the semiconductors and insu- 
lators treated so far a model function [Eq. (|27l) ] for the 
correlation self-energy with three poles, i.e., A^p = 3, was 
sufficient, we must use a five-pole function in the case of 
Ni to reproduce the values that we get from the reference 



15 



Table II: Computational time for the calculation of quasi- 
particle shifts of diamond in the conventional (1x1x1) and 
supercell (2x2x2 and 4x4x4) geometries. For the former 
we also show corresponding timings without use of symmetry 
("No") and with restricted use ("Only IS" and "Only IBZ") as 
well as the effect of the MPB optimization with a threshold 

value Umin- 



Geometry 


Atoms 


k mesh 


Symmetry 


^min 


CPU time 


1x1x1 


2 


4x4x4 


No 




5 min 52 s 








Only IS 




3 min 34 s 








Only IBZ 




30 s 








Yes 




11 s 








Yes 


0.65 


5 s 


2x2x2 


16 


2x2x2 


Yes 


0.65 


14 min 15 s 


4x4x4 


128 


1x1x1 


Yes 


0.65 


34 h 11 min 



contour-integration method. A comparison of the LSDA 
and GW band structures shows that the self-energy cor- 
rection is strongly state and k dependent whereas in the 
case of materials with a band gap, the quasiparticle shifts 
are more or less uniform over the BZ but different for 
occupied and unoccupied states. The GW quasiparticle 
correction reduces the occupied d-band width from about 
4.0 eV in the LSDA to 3.2 eV, which is in accordance with 
x-ray photoemission experimentsi^ On the other hand, 
the exchange splitting is hardly improved. There is only a 
slight reduction, which cannot account for the large over- 
estimation within LSDA. The reason for this shortcoming 
is that the GW self-energy lacks two-particle vertex cor- 
rections, which give rise to spin-dependent screening and 
thus to a different correction for spin-up and spin-down 
states. Furthermore, the 6 eV satellite, which originates 
from a virtual bound two-hole excitation, cannot be de- 
scribed within the GW approximation for the same rea- 
son. Our GW band structure compares very well with 
an early work within the LAPW method^ but less well 
with a more recent LMTO calculation. '"'^ About this dis- 
crepancy we can only speculate. It might be attributed 
to a less accurate description of unoccupied states within 
the LMTO basis or to the usage of the offset-F method 
in Ref. ,56,, in which the numerically important region 
around the center of the BZ is treated only approxi- 
mately. 

In order to demonstrate the efficiency of the code, we 
show the computational time for calculating quasiparticle 
shifts for diamond in the conventional unit cell (1x1x1) 
containing two atoms as well as in 2 x 2 x 2 and 4x4x4 su- 
percell geometries containing 16 and 128 atoms, respec- 
tively. We choose the parameters so that all three calcu- 
lations yield identical results. For example, the k mesh 
contains 4x4x4, 2x2x2, and 1x1x1 points, respectively. 
The other parameters are determined to ensure conver- 
gence to within 0.01 eV. The computation times on a sin- 
gle CPU are given in Table HIl While the calculation of 
the quasiparticle shifts takes only 5 s for the conventional 



unit cell, even the treatment of supercells containing 16 
and 128 atoms only consumes affordable 0.24 and 34.2 h 
of computation time, respectively. 

In the case of the conventional unit cell (1x1x1), we 
also demonstrate the efficiency gain achieved by exploit- 
ing the symmetry according to Sec. IIII Gl and by using a 
threshold parameter t;inin as in Sec. IIIIFI If symmetry 
is not used at all, then the computation of the quasipar- 
ticle shifts takes nearly 6 min. The diamond structure 
exhibits inversion symmetry (IS), which allows to define 
the bare and screened Coulomb matrices as real sym- 
metric instead of complex Hermitian quantities after a 
symmetrization of the MT functions as briefly described 
in Sec. IIII B] (for more details, see Ref. [27l) . This reduces 
the computation time roughly by a factor of 2 ("Only 
IS"). If we next calculate the screened interaction only 
in the irreducible wedge of the BZ, i.e., at eight k points 
instead of 64, the computation time goes down further to 
30 s ("Only IBZ"). The gain is slightly less than a factor 
of 8 because the BZ summation in Eq. ((25|) must still be 
performed with all 64 k points. We can only restrict this 
summation and the sum over q in Eq. ([^^ if we use the 
extended IBZ (EIBZ) as explained in Sec. IIII Gl which 
leads to further time savings of a factor of 3. Compared 
to the first calculation, the usage of symmetry thus leads 
to a 32 times faster execution without loss of accuracy. 
By introducing a threshold parameter Umin = 0.65 for 
the optimization of the MPB, we can even reduce the 
computation time further to only 5 s, gaining an overall 
factor of 70. 



V. CONCLUSIONS 

In this paper, we described an implementation of the 
GW approximation for the electronic self-energy within 
the all-electron full-potential linearized augmented- 
plane-wave method. We employ a mixed product ba- 
sis, which is specifically designed for the representation 
of wave-function products and retains the full accuracy 
of the all-electron framework. As all-electron GW cal- 
culations have so far been prohibitive for large systems 
due to the computational cost, we presented ways to 
speed up the calculations considerably so that supercell 
calculations for defect systems, nanowires, interface, or 
surface structures become feasible. As a demonstration, 
we showed that our computer code can treat 128 carbon 
atoms in a diamond supercell. This was achieved by ex- 
ploiting spatial and time-reversal symmetries in the eval- 
uation of the polarization function and the self-energy. 
Both quantities exhibit a k dependence and also involve a 
BZ summation. While we only need to consider k points 
in the IBZ for the former, the latter can be restricted to 
an EIBZ, which accelerates the code considerably. If the 
system exhibits inversion symmetry, a symmetrization of 
the MT part of the mixed product basis leads to real 
symmetric instead of complex Hermitian response matri- 
ces, which reduces the CPU time and memory demand. 



16 



Furthermore, for the correlation part of the self-energy 
we can apply an optimization of the mixed product ba- 
sis that involves a basis transformation to the eigenvec- 
tors of the Coulomb matrix. By neglecting eigenvectors 
with eigenvalues below a certain threshold value we only 
retain the dominant part of the bare electron-electron 
interaction. The threshold value then becomes a conver- 
gence parameter. This optimization reduces the matrix 
sizes of response quantities such as the screened inter- 
action, again giving rise to a speed up of the calcula- 
tion. We note that no further approximations such as 
plasmon-pole models or a range separation of the inter- 
action potential are introduced, and the anisotropy of 
the screening at k = is fully taken into account. The 
divergence of the bare and the screened interaction po- 
tential in the limit k — >■ is treated analytically while 
zeroth-order correction are derived with the help of k • p 
perturbation theory. This procedure gives rise to a fast 
k-point convergence, which is particularly important for 
GW calculations. 

We showed convergence tests for silicon and strontium 
titanate as a prototype semiconductor and transition- 
metal oxide, respectively, to illustrate the accuracy of the 
mixed product basis and its optimization with a thresh- 
old value for the Coulomb eigenvalues. The results al- 
ready converge with relatively modest parameters. For 
example, for the angular momenta inside the MT spheres 
and the plane- wave representation in the IR, cutoff values 
well below the exact limit (i.e.. twice the corresponding 
FLAPW cutoffs) are sufficient for convergence of the gap 
energies to within 0.01 eV. In fact, the cutoff values can 



even be chosen smaller than the FLAPW cutoffs. For ref- 
erence, we reported the fundamental GW band gaps for a 
variety of semiconductors and insulators. Our results are 
in good agreement with recent GW calculations based on 
the PAW method and with experiments, although there 
is a somewhat larger discrepancy with older GW results 
obtained within the LMTO method. For ferromagnetic 
Ni, we find that the GW self-energy reduces the c?-band 
width from 4.0 to 3.2 eV in very good agreement with 
experiment, but hardly improves the overestimation of 
the exchange splitting within LSDA. These results are in 
accordance with previous calculations. 

For simplicity, we have restricted ourselves to the non- 
self-consistent approach. However, with the numeri- 
cal procedure presented here we are prepared to follow 
Ref. [sil and extend the method to the quasiparticle self- 
consistent scheme. Within this approach, the full self- 
energy matrix including off-diagonal elements is needed. 
The extension of the numerical procedure developed in 
the present paper to these elements is straightforward. 
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